%load_ext pretty_jupyter
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from plotly.offline import init_notebook_mode
init_notebook_mode()
import cpi
# cpi.update()
Preamble¶
Recently, I was made aware that there isn't much evidence of a relationship between spending and academic achievement in public K-12 education. This intrigued me, so this document investigates the topic, with the limitation that I'm only working with publicly available data. Here, I look at the question at the state level only. I plan to continue the investigation in future documents, using both district-level and school-level data and a multilevel/hierarchical analysis in the vein of Andrew Gelman.
My conclusion? There is little evidence of any improvement in academic achievement in public K-12 education in Utah as a whole since at least the 1990s. Moreover, over the same period there has been a dramatic increase in inflation-adjusted per-student spending.
I've looked at four measures of academic achievement: graduation rate, ACT scores, proficiency tests, and student growth metrics. In each case, the measure either shows no improvement or is fundamentally deficient as a longitudinal statistic. Other measures of achievement may exist, but they are either 1) not publicly available or 2) reported inconsistently over short time spans.
A few little things:
- This is an HTML version of a Jupyter notebook. Code cells are made visible by clicking the 'code' buttons on the right.
- Data sources are documented at the end of this article.
- When school years are noted, I typically only provide the first year (e.g., 2015 denotes the 2015/2016 school year).
##################
# DATA IMPORT CELL
##################
expenditure_pre = pd.read_csv('data/ccd_bespoke/elsi_expenditure_te5.csv', nrows=1)
expenditure = expenditure_pre.transpose().reset_index(drop=True).rename(columns={0: 'unadjusted'})
expenditure['year'] = range(2021,1985,-1)
expenditure['adjusted'] = expenditure.apply(lambda x: cpi.inflate(x['unadjusted'],
x['year'],
area="West"),
axis=1)
expenditure_per_student_pre = pd.read_csv('data/ccd_bespoke/elsi_expenditure_per_student_te5.csv', nrows=1)
expenditure_per_student = expenditure_per_student_pre.transpose(). \
reset_index(drop=True).rename(columns={0: 'unadjusted'})
expenditure_per_student['year'] = range(2021,1985,-1)
expenditure['per_student'] = expenditure_per_student.apply(lambda x: cpi.inflate(x['unadjusted'],
x['year'],
area="West"),
axis=1)
################
enrollment = pd.read_csv('data/ccd_bespoke/elsi_total_enrollment.csv', nrows=1)
enrollment = enrollment.transpose().reset_index(drop=True).rename(columns={0: 'totalEnrollment'})
enrollment['year'] = range(2022,1985,-1)
################
nonfiscal = pd.read_csv("data/ccd_nonfiscal/nonfiscal.csv", dtype={'ST_LEAID': 'category',
'LEAID': 'category',
'ST_SCHID': 'category',
'STUDENT_COUNT': 'Int32',
'SCHID': 'category'})
################
usbe = pd.read_csv("data/usbe/HistoricalGraduationRates.csv",
index_col=0).transpose().rename(columns={'State of Utah': 'gradRate'})
usbe.index = pd.to_numeric(usbe.index)
################
diplomas = pd.read_csv("data/ccd_bespoke/elsi_diploma_recipients.csv",
nrows=1).transpose().rename(columns={0: 'graduates'}).reset_index(drop=True)
diplomas['year'] = range(2022,1985,-1)
################
grade_8 = pd.read_csv("data/ccd_bespoke/elsi_grade_8_enrollment.csv",
nrows=1).transpose().rename(columns={0: 'enrollment'}).reset_index(drop=True)
grade_8['year'] = range(2022,1985,-1)
grade_9 = pd.read_csv("data/ccd_bespoke/elsi_grade_9_enrollment.csv",
nrows=1).transpose().rename(columns={0: 'enrollment'}).reset_index(drop=True)
grade_9['year'] = range(2022,1985,-1)
grade_10 = pd.read_csv("data/ccd_bespoke/elsi_grade_10_enrollment.csv",
nrows=1).transpose().rename(columns={0: 'enrollment'}).reset_index(drop=True)
grade_10['year'] = range(2022,1985,-1)
grade_11 = pd.read_csv('data/ccd_bespoke/elsi_grade_11_enrollment.csv',
nrows=1).transpose().rename(columns={0: 'enrollment'}).reset_index(drop=True)
grade_11['year'] = range(2023,1985,-1)
################
act = pd.DataFrame({'year': range(2000,2024),
'utahParticipants': [22103, 21010, 21007, 20856, 20593, 21561, 22008,
22598, 23229, 24824, 25161, 32835, 34514, 35074,
40629, 41446, 42580, 43791, 43790, 44446, 39724,
43125, 43645, 44550],
'utah_score': [21.4, 21.4, 21.3, 21.5, 21.5, 21.7, 21.7, 21.8, 21.8,
21.8, 21.8, 20.7, 20.7, 20.8, 20.2, 20.2, 20.3, 20.4,
20.3, 20.2, 20.6, 19.9, 19.9, 20.0],
'national_score': [None, 20.8, 20.8, 20.9, 20.9, 21.1, 21.2, 21.1,
21.1, 21.0, 21.1, 21.1, 20.9, 21.0, 21.0, 20.8,
21.0, 20.8, 20.7, 20.6, 20.3, 19.8, 19.5, 19.4]})
act = act.merge(grade_11, how='inner', on='year')
act['participation'] = act['utahParticipants'] / act['enrollment']
################
pace = pd.read_csv("data/pace/pace.csv", dtype={'LEA': 'category',
'LEANumber': 'category',
'SchoolNumber': 'category',
'ST_SCHID': 'category'})
################
rank = pd.read_csv("data/rank/rank_with_ST_SCHID.csv", dtype={'SchoolNumber': 'category'})
################
grades = pd.read_csv("data/grades/grades.csv", dtype={'LEA Number': 'category',
'SchoolNumber': 'category'})
Spending Trends¶
At the state level, there's been a consistent increase in total spending since 1986. In 1986, total spending was \$2.7 billion and increased to \$7.4 billion in 2021 (all dollar amounts are inflation-adjusted to 2024 dollars).
fig = go.Figure()
fig.add_trace(go.Scatter(x=expenditure['year'],
y=expenditure['adjusted'],
mode='lines',
))
fig.update_yaxes(title='2024 dollars',
tickprefix="$")
fig.update_layout(title='Total State Public Ed Spending',
title_x=0.5,
title_font_size=24,
width=700,
height=500
)
fig.show()
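The inflation adjustment behind these figures is a ratio of price-index levels. The notebook delegates this to the cpi package; the sketch below only illustrates the arithmetic. The function name `to_target_dollars` is hypothetical, and the index values are made-up placeholders chosen for easy arithmetic, not real CPI data.

```python
def to_target_dollars(nominal, index_then, index_target):
    """Restate a nominal amount in target-year dollars using price-index levels."""
    return nominal * index_target / index_then

# Placeholder index levels (NOT real CPI values): prices are 2.5x higher in the target year.
index_then = 100.0
index_target = 250.0

adjusted = to_target_dollars(1_000.0, index_then, index_target)
print(adjusted)  # 2500.0
```

Because every year is restated against the same target year, amounts from different years become directly comparable.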
Enrollment rose as well, of course, from 416,000 students to 691,000 students over the same years.
fig = go.Figure()
fig.add_trace(go.Scatter(x=enrollment['year'],
y=enrollment['totalEnrollment'],
mode='lines',
))
fig.update_yaxes(title='students')
fig.update_layout(title='Total State Enrollment',
title_x=0.5,
title_font_size=24,
width=700,
height=500
)
fig.show()
Do the math and you get the spending-per-student numbers shown below. It's a notable increase, from \$6,570 per student in 1986 to \$10,696 in 2021. That's a 63% increase in inflation-adjusted dollars.
fig = go.Figure()
fig.add_trace(go.Scatter(x=expenditure['year'],
y=expenditure['per_student'],
mode='lines',
))
fig.update_yaxes(title='2024 dollars',
tickprefix="$")
fig.update_layout(title='Spending per Student',
title_x=0.5,
title_font_size=24,
width=700,
height=500
)
fig.show()
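The percentage quoted above can be checked from the two per-student figures, and the per-student figure can be cross-checked against the enrollment and total-spending numbers. A quick sketch using only numbers stated in this section (the variable names are mine):

```python
per_student_1986 = 6_570
per_student_2021 = 10_696

# Relative change in inflation-adjusted per-student spending.
pct_increase = (per_student_2021 / per_student_1986 - 1) * 100
print(f"{pct_increase:.0f}%")  # 63%

# Cross-check: per-student spending times enrollment should land near the stated 1986 total.
implied_total_1986 = per_student_1986 * 416_000
print(f"${implied_total_1986 / 1e9:.1f} billion")  # $2.7 billion
```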
The CCD expenditure data only goes to the 2021/22 school year, and I'm unsure of an equivalent figure in state budget reports, but it seems unlikely to me that the number has decreased. According to the 2024 Utah Budget[1], total spending was \$7.7 billion (caveat emptor, I'm not an accountant). Enrollment in Utah public schools was 672,662 for the 2023/24 school year. That works out to per-student spending of \$11,476.
Regardless of the details that I've touched on in this section, we can be sure that spending has increased by just about any reasonable metric since 1986.
Achievement Trends¶
Let's look at four measures of academic achievement and see if they have a relationship with spending per student:
1. Graduation Rate¶
There are two primary methods for calculating graduation rates: adjusted cohort graduation rate (ACGR)[2] and averaged freshman graduation rate (AFGR)[3]. The ACGR is the more accurate of the two, but it cannot be calculated from publicly available data and has only been published by the USBE since 2008. Additionally, as we'll see below, 2008 is an unfortunate year to begin publishing a graduation rate for Utah. The AFGR, on the other hand, can be calculated from publicly available data over many years (since 1990 here).
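To make the AFGR concrete: it divides the diplomas awarded in a given year by an estimate of that class's freshman cohort, taken as the average of grade 8, grade 9, and grade 10 enrollment in three consecutive earlier years. The sketch below is a minimal illustration, assuming series indexed by consecutive school years in ascending order and labeled by their first calendar year. The function name `afgr` and all enrollment numbers are made up for the example.

```python
import pandas as pd

def afgr(diplomas, grade_8, grade_9, grade_10):
    """Averaged freshman graduation rate.

    All inputs are Series indexed by consecutive school years (ascending),
    each year labeled by its first calendar year.
    """
    # Grade 8 four years back, grade 9 three years back, grade 10 two years back
    # all describe (roughly) the same class of students.
    freshman_estimate = (grade_8.shift(4) + grade_9.shift(3) + grade_10.shift(2)) / 3
    return diplomas / freshman_estimate

years = range(2000, 2005)
# Made-up counts, for illustration only.
grade_8 = pd.Series([1000, 1010, 1020, 1030, 1040], index=years)
grade_9 = pd.Series([1020, 1030, 1040, 1050, 1060], index=years)
grade_10 = pd.Series([950, 960, 970, 980, 990], index=years)
diplomas = pd.Series([800, 810, 820, 830, 850], index=years)

rates = afgr(diplomas, grade_8, grade_9, grade_10)
print(rates.loc[2004])  # 0.85
```

Earlier years come out as NaN because the grade 8 enrollment four years back is not in the toy data. This is also why a calculated AFGR series starts several years after the enrollment data does.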
A simple graph of the ACGR from the USBE shows a meteoric rise in graduation rates, going from 69% in 2008 to 88% in 2023:
fig = go.Figure()
fig.add_trace(go.Scatter(x=usbe.index,
y=usbe['gradRate']*100,
mode='lines',
name='AC Graduation Rate (USBE)'))
fig.update_yaxes(ticksuffix="%")
fig.update_layout(title="ACGR published by USBE",
yaxis_title="Graduation rate",
title_x=0.5,
title_font_size=24,
width=700,
height=500)
fig.show()
But there's something that bothers me looking at this graph. It appears that 2008 was a year of peak growth in graduation rates, as if one would see even lower graduation rates if one were to extend the graph to years prior to 2008. But were graduation rates really below 69% in the early 2000s? That does not feel true. So what were graduation rates like historically?
We can't use the ACGR to answer that question because the USBE wasn't tracking students in a way to calculate it prior to 2008. However, we can calculate the AFGR all the way back to 1990. And plotting the AFGR is rather illuminating:
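# AFGR: diplomas divided by the average of grade 8, 9, and 10 enrollment
# from 4, 3, and 2 school years earlier. Rows are in descending year order,
# so shift(-k) pulls in the value from k years earlier.
diplomas['afgr'] = diplomas['graduates'] / ((grade_8['enrollment'].shift(-4) +
grade_9['enrollment'].shift(-3) +
grade_10['enrollment'].shift(-2)) / 3)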
fig = go.Figure()
fig.add_trace(go.Scatter(x=diplomas['year'],
y=diplomas['afgr']*100,
mode='lines',
name='AF Graduation Rate (calculated)'))
fig.update_yaxes(ticksuffix="%")
fig.update_layout(title="AFGR calculated from NCES CCD",
yaxis_title="Graduation rate",
title_x=0.5,
title_font_size=24,
width=700,
height=500)
fig.show()
This paints a very different picture. Graduation rates in Utah have historically hovered around 84%, experienced a sizable dip[4] from about 2005 to 2012, and have since risen to around 89% in the 2020s. That is, over a 30-year period, there has been an inconsistent and modest rise in the graduation rate from 84% to 89% while spending per student has risen dramatically. Over the same period, the national graduation rate (AFGR) rose from approximately 73% in the 1990s to 87% in 2018, eroding most of the lead Utah once had[5].
It appears that the USBE has been pretty inconsistent in reporting the graduation rate. If you google the topic over the years where the dip is seen (search for something like "2008 Utah graduation rate") you'll find articles addressing the issue. For example:
- https://www.deseret.com/2010/12/1/20157294/report-says-utah-dropout-rate-bucks-national-trend/
- https://archive.sltrib.com/article.php?id=51962015&itype=CMSID
- https://www.ksl.com/article/18572796/utah-high-school-graduation-drops-to-75-percent-still-beats-national-average
One thing I find disturbing is that the USBE (called the Office of Education at the time) seems to contradict its own publications on the graduation rate. This could be a consequence of having a myriad of ways to calculate a graduation rate. Discussing them comprehensively is beyond the scope of this article, but one thing is certain: there are about as many methods for calculating the graduation rate as there are ski runs in Utah. And this is a huge problem, because leaders can choose whatever method they like whenever they want and slap on the label of "Official Graduation Rate" when in reality they're just cherry-picking a definition that presents a flattering message at that moment.
2. ACT scores¶
From 2000 to 2010 the average ACT score in Utah rose from 21.4 to 21.8 (the maximum score on the ACT is 36) before entering a period of decline, reaching a mean score of 20.0 in 2023. In 2010 there was an initiative to give all Utah students access to the ACT. This resulted in a marked increase in participation rates which, in turn, resulted in a lower mean score[6]. When the two are plotted together, mean scores appear to be almost perfectly inversely related to participation rates.
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
go.Scatter(x=act['year'], y=act['utah_score'], mode='lines', name='Mean ACT Score (Utah)'),
secondary_y=False
)
fig.add_trace(
go.Scatter(x=act['year'], y=act['participation']*100, mode='lines', name='participation rate'),
secondary_y=True
)
fig.update_layout(title='Mean ACT Score vs Participation',
title_x=0.5,
title_font_size=24,
width=800,
height=600
)
fig.update_yaxes(title_text="<b>ACT Score</b>",
ticksuffix="",
showgrid=False,
secondary_y=False)
fig.update_yaxes(title_text="<b>% of students</b>",
ticksuffix="%",
secondary_y=True)
fig.show()
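The inverse relationship can also be summarized with a correlation coefficient. The sketch below uses the Utah participant counts and mean scores from the data import cell. As a simplifying assumption, it correlates scores with raw participant counts rather than participation rates, since the grade 11 enrollment file isn't reproduced here. The name `act_sketch` is mine.

```python
import pandas as pd

act_sketch = pd.DataFrame({
    'year': range(2000, 2024),
    'participants': [22103, 21010, 21007, 20856, 20593, 21561, 22008,
                     22598, 23229, 24824, 25161, 32835, 34514, 35074,
                     40629, 41446, 42580, 43791, 43790, 44446, 39724,
                     43125, 43645, 44550],
    'utah_score': [21.4, 21.4, 21.3, 21.5, 21.5, 21.7, 21.7, 21.8, 21.8,
                   21.8, 21.8, 20.7, 20.7, 20.8, 20.2, 20.2, 20.3, 20.4,
                   20.3, 20.2, 20.6, 19.9, 19.9, 20.0],
})

# Pearson correlation between the number of test takers and the mean score.
r = act_sketch['participants'].corr(act_sketch['utah_score'])
print(round(r, 2))  # strongly negative
```

A correlation this strong is what one would expect if the change in who takes the test, rather than a change in instruction, drives most of the movement in the mean score.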
Participation rates began to increase similarly in other states after 2010, explaining a drop in the mean national ACT score. However, national scores have continued to drop since 2020 even as participation rates have begun to decline, presumably a consequence of COVID. While Utah's mean score has declined, it has declined at a slower rate, and the mean score for Utah in 2023 is 0.6 points higher than the national average.
At the risk of sounding pedantic, I would like to note that declining at a slower-than-average rate is still declining. There is no evidence that ACT scores are improving. And I would again like to remind the reader that the drop in mean ACT scores in Utah is accompanied by significant increases in per-student spending.
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
go.Scatter(x=act['year'], y=act['utah_score'], mode='lines', name='Mean ACT Score (Utah)'),
secondary_y=False
)
# Empty placeholder trace: it draws nothing but takes up one slot in Plotly's default color cycle.
fig.add_trace(go.Scatter())
fig.add_trace(
go.Scatter(x=act['year'], y=act['national_score'], mode='lines', name='Mean ACT Score (National)'),
secondary_y=False
)
fig.update_layout(title='Mean ACT Score - Utah vs National',
title_x=0.5,
title_font_size=24,
width=800,
height=600
)
fig.update_yaxes(title_text="<b>ACT Score</b>")
fig.show()
3. Proficiency¶
There isn't much documentation about proficiency on the USBE website, but it's described as
"an academic achievement indicator for all schools ... based on annual statewide administration of a standards-based assessment for each respective grade span."[7]
It isn't stated on the USBE website, but it's a fair assumption that the method for calculating proficiency must have changed from 2012 to 2013, given that there is such an enormous drop followed by relatively constant proficiency numbers. The methodology for testing proficiency is also not detailed on the USBE website so any perceived trends should be met with some caution.
Again, proficiency stays relatively constant while per-student spending increases.
# merge rank and grades datasets
rank_prof = rank.loc[~(rank['YEAR'] == 2016), ['YEAR',
'LEA',
'SchoolName',
'SchoolNumber',
'ST_SCHID',
'StudentCount',
'PercentAchievementScore']]
rank_prof = rank_prof.rename(columns = {'PercentAchievementScore': 'proficient percent'})
grades_prof = grades[['YEAR',
'LEA',
'SchoolName',
'LEA Number',
'SchoolNumber',
'ST_SCHID',
'ELA Proficient',
'ELA Proficient Possible',
'Math Proficient',
'Math Proficient Possible',
'SC Proficient',
'SC Proficient Possible']].copy()
grades_prof['proficient'] = grades_prof[['ELA Proficient',
'Math Proficient',
'SC Proficient']].sum(axis=1)
grades_prof['proficient possible'] = grades_prof[['ELA Proficient Possible',
'Math Proficient Possible',
'SC Proficient Possible']].sum(axis=1)
grades_prof['proficient percent'] = grades_prof['proficient'] / grades_prof['proficient possible']
proficiency = rank_prof.merge(grades_prof, how='outer', on=['YEAR', 'ST_SCHID', 'proficient percent'])
# make an enrollment-weighted average of proficiency
prof_merge = proficiency.merge(nonfiscal[['ST_SCHID', 'YEAR', 'STUDENT_COUNT']], how='left', on=['ST_SCHID', 'YEAR'])
# drop schools missing either value so the numerator and denominator cover the same schools
prof_merge = prof_merge.dropna(subset=['proficient percent', 'STUDENT_COUNT'])
prof_merge['weighted_prof'] = prof_merge['proficient percent'] * prof_merge['STUDENT_COUNT']
prof_merge = prof_merge[['YEAR', 'STUDENT_COUNT', 'weighted_prof']].groupby('YEAR').sum()
prof_merge = prof_merge['weighted_prof'] / prof_merge['STUDENT_COUNT']
exp_prof = expenditure[expenditure['year'] > 2009]
# plot
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
go.Scatter(x=prof_merge.index, y=prof_merge*100, mode='lines', name='Percent of students proficient'),
secondary_y=False
)
fig.add_trace(
go.Scatter(x=exp_prof['year'], y=exp_prof['per_student'], mode='lines', name='Spending per Student'),
secondary_y=True
)
fig.update_layout(title='Percent of Students Proficient',
title_x=0.5,
title_font_size=24,
width=800,
height=600
)
fig.update_yaxes(title_text="<b>Percent of students proficient</b>",
ticksuffix="%",
showgrid=False,
secondary_y=False)
fig.update_yaxes(title_text="<b>Spending per student</b>",
tickprefix="$",
secondary_y=True)
fig.show()
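The statewide proficiency line above is an enrollment-weighted average of school-level proficiency, so large schools count for more than small ones. A minimal sketch of that calculation follows, with made-up schools and hypothetical column names. It also shows why schools without a reported score should be left out of the enrollment total.

```python
import numpy as np
import pandas as pd

# Made-up schools, for illustration only.
schools = pd.DataFrame({
    'school': ['A', 'B', 'C'],
    'proficient_share': [0.50, 0.30, np.nan],  # school C has no reported score
    'students': [100, 300, 600],
})

# Keep only schools that have a score, so numerator and denominator match.
scored = schools.dropna(subset=['proficient_share'])
weighted_mean = (scored['proficient_share'] * scored['students']).sum() / scored['students'].sum()
simple_mean = scored['proficient_share'].mean()

print(round(weighted_mean, 2))  # 0.35
print(round(simple_mean, 2))    # 0.4
```

If school C's 600 students were left in the denominator, the result would fall to 0.14, which understates proficiency simply because a score is missing.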
4. Growth¶
What a mess. The growth metric is defined as an indicator that measures
"the rate of increase in students’ academic progress, regardless of their present level of proficiency, over time."[8]
In the grades dataset (2012-2016), this is reported as an average percentage of possible growth. Then, in the rank dataset (2016-2023), it's reported as an average student growth percentile (SGP) "score". Moreover, in 2021, the SGP score calculation changes via a new method for allotting "index points"[9]. So, in reality, there are three different calculations for growth-like metrics over three different periods: 2012-2016, 2016-2020, and 2021-2023.
Reporting growth as an SGP is great for comparing individual students within the same year, but at every level of increased aggregation (schools, districts, state) the SGP loses usefulness because of the nature of percentiles. In the extreme case, which is the one we care about here, the statewide average SGP should always be about 50 (the 50th percentile), which means it carries no information. SGPs are of limited value for evaluating trends over time and of no value for evaluating the effectiveness of education over time at the state level.
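A small simulation illustrates the point about percentiles. This is a simplified sketch, not the USBE's method: real SGPs are computed relative to students with similar prior scores, whereas the toy below uses a plain within-year percentile rank. The data are randomly generated and all names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Two hypothetical years of raw growth scores; the second year is far stronger.
weak_year = pd.Series(rng.normal(loc=5.0, scale=2.0, size=1000))
strong_year = pd.Series(rng.normal(loc=9.0, scale=2.0, size=1000))

def mean_percentile(raw_growth):
    """Average within-year percentile rank, on a 0-100 scale."""
    return (raw_growth.rank(pct=True) * 100).mean()

# Raw growth differs a lot between the two years...
print(round(strong_year.mean() - weak_year.mean(), 1))
# ...but the average percentile is close to 50 in both.
print(mean_percentile(weak_year), mean_percentile(strong_year))
```

Whatever happens to raw growth, ranking students against each other within a year resets the average to the middle of the scale, so a statewide mean percentile cannot show a trend.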
But a measure of growth (not growth percentiles) would be useful at the state level. In an earlier draft, I included a plot of growth scores from 2012 to 2016 but have removed it given that it's such a short time span and no longer current.
I feel that the USBE is doing a disservice to the public in how it reports growth metrics. SGPs provide no usable information for assessing trends in K-12 education at the state level, and even at the school level the method of allotting "index points" has changed, so the SGP "score" is inconsistent. Furthermore, even for 2012 to 2016, when objective growth scores are reported, there is no outline of the testing methodology, as we saw earlier with proficiency.
Conclusion¶
Basically, there's no evidence that there's a relationship between spending and academic achievement at the state level, given the data that's publicly available. However, all analysis is limited by the data. Fiscal and enrollment data goes back much farther but ACT scores are only available from 2000 and USBE achievement data is only published after 2012. Moreover, the student growth and proficiency data is poorly documented and inconsistent. These issues should likely be addressed.
Data Sources¶
I've used an assortment of data sources, mostly from the National Center for Education Statistics (NCES) Common Core of Data (CCD), the Utah State Board of Education (USBE), and ACT Inc. publications. They are as follows:
- expenditure contains yearly expenditure figures from NCES CCD. There are multiple variables that could be used (for example TE11 + E4D + E7A1 or TR), but we're using TE5 here, which constitutes most of what could be considered "current expenditure" but does not include payment on interest, adult education, and a few other items (again, I'm not an accountant). Regardless of the measure of spending (any variable from the CCD or data from Utah budget documents), they all basically tell the same story: large and consistent spending increases.
- enrollment contains total enrollment numbers from NCES CCD. This excludes adult education students.
- nonfiscal contains school-level enrollment numbers. Obtained from NCES CCD.
- usbe contains graduation rates. Obtained here: https://www.schools.utah.gov/datastatistics/_datastatisticsfiles_/_reports_/_graduationdropoutrates_/HistoricalGraduationRates.xlsx. The data for years 2022 and 2023 were found in publications from USBE: https://schools.utah.gov/superintendentannualreport/dataandstatistics/fy2022/2022GraduationRates.pdf and https://schools.utah.gov/datastatistics/_datastatisticsfiles_/_reports_/_graduationdropoutrates_/2023GraduationRates.pdf
- diplomas contains the number of regular diplomas awarded by year. From NCES CCD until after 2010. Data after 2010 was taken from yearly USBE publications concerning the graduation rate (found from google searches). However, I couldn't find diploma counts for 2013 and 2020, so they were estimated by back-calculating awarded diplomas from the ACGR.
- grade_8, grade_9, grade_10, and grade_11 contain grade-specific enrollment data by year. From NCES CCD.
- act contains Utah participant counts for the ACT, mean Utah scores, and mean national scores, all by year. Compiled from ACT Profile Reports (e.g., https://www.act.org/content/dam/act/unsecured/documents/Natl-Scores-2014-Utah.pdf) and from ACT dashboards (e.g., https://www.act.org/content/act/en/research/services-and-resources/data-and-visualization/grad-class-database-2024.html).
- pace, rank, and grades contain school-level academic achievement data. Obtained from USBE here: https://www.schools.utah.gov/assessment/resources.
The NCES CCD data can be found at https://nces.ed.gov/ccd/files.asp.
Notes¶
1 https://le.utah.gov/interim/2024/pdf/00003198.pdf
2 https://nces.ed.gov/programs/dropout/ind_04.asp.
3 https://nces.ed.gov/programs/dropout/ind_05.asp.
4 The dip seems centered around the great financial crisis of 2008 but that doesn't offer an explanation for why the rates go down before 2008. One possible explanation could be an increase in Hispanic enrollment beginning in 2005 as Hispanic students have below average graduation rates. More research is needed to confirm this hypothesis.
5 https://nces.ed.gov/programs/digest/d23/tables/dt23_219.10.asp
6 A publication by ACT Inc. noting this change in multiple states: https://www.act.org/content/dam/act/unsecured/documents/Statewide-Adoption.pdf
7 Proficiency is referred to as "Achievement" in the USBE assessment technical manual. Page 16 of 2024AccountabilityTechnicalManual.pdf. Accessed on Oct 15, 2024 from https://www.schools.utah.gov/assessment/resources.
8 Page 17 of 2024AccountabilityTechnicalManual.pdf. Accessed on Oct 15, 2024 from https://www.schools.utah.gov/assessment/resources.
9 See footnote on page 19 of 2024AccountabilityTechnicalManual.pdf. Accessed on Oct 15, 2024 from https://www.schools.utah.gov/assessment/resources.